compare-commits.sh: support both llama-bench and test-backend-ops #14392

Open: yeahdongcn wants to merge 5 commits into master

yeahdongcn (Collaborator) commented Jun 26, 2025

Make sure to read the contributing guidelines before submitting a PR

This is a follow-up to #14368, adding support for comparing test-backend-ops performance results between two commits.

Testing Done

Generated Tables

❯ cd /Users/yexiaodong/go/src/github.com/ggerganov/llama.cpp && python3 scripts/compare-llama-bench.py -b 1d5f25c53 -c ecd7fdb4c --tool test-backend-ops -i ./test-backend-ops.sqlite
| Backend   | Operation   | Parameters                              |   Bandwidth (GB/s) 1d5f25c53 |   Bandwidth (GB/s) xd/compare |   Speedup |
|:----------|:------------|:----------------------------------------|-----------------------------:|------------------------------:|----------:|
| Metal     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,1,1,1]   |                        28.42 |                         28.45 |      1.00 |
| Metal     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,512,1,1] |                        86.60 |                         97.14 |      1.12 |

❯ cd /Users/yexiaodong/go/src/github.com/ggerganov/llama.cpp && python3 scripts/compare-llama-bench.py -b ecd7fdb4c -c ecd7fdb4c --tool test-backend-ops -i ./test-backend-ops.sqlite
| Backend   | Operation   | Parameters                                                                       |   GFLOPS xd/test-backend-ops_sql |   GFLOPS xd/test-backend-ops_sql |   Speedup |
|:----------|:------------|:---------------------------------------------------------------------------------|---------------------------------:|---------------------------------:|----------:|
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=128,n=1,k=16416,bs=[8,1],nr=[4,1],per=[0,1,2,3],v=1      |                           127.90 |                           127.90 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=16416,n=1,k=128,bs=[8,1],nr=[4,1],per=[0,2,1,3],v=0      |                            33.98 |                            33.98 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=4096,n=1,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0     |                            57.91 |                            57.91 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=4096,n=2,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0     |                           115.00 |                           115.00 |      1.00 |
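The Speedup column in the tables above appears consistent with dividing the compare commit's value by the baseline commit's value. The following is a minimal sketch of that arithmetic using the two ADD rows; the function name `speedup` and the `rows` list are illustrative and not taken from `compare-llama-bench.py`, which may compute the ratio from unrounded values.

```python
def speedup(baseline: float, compare: float) -> float:
    """Ratio of compare to baseline for a higher-is-better metric.

    A result above 1.0 means the compare commit measured faster.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return compare / baseline


# (parameters, baseline GB/s, compare GB/s), copied from the ADD table above.
rows = [
    ("type=f32,ne=[4096,1,1,1],nr=[1,1,1,1]", 28.42, 28.45),
    ("type=f32,ne=[4096,1,1,1],nr=[1,512,1,1]", 86.60, 97.14),
]

for params, base, comp in rows:
    # Two decimal places, matching the precision shown in the tables.
    print(f"{params}: {speedup(base, comp):.2f}")
```

Because the ratio is compare over baseline, comparing a commit against itself (as in the MUL_MAT table) yields 1.00 for every row.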

Generated Plot

[Image: plot]

Full Logs

test-backend-ops:

root@deccddc39743:/ws# CMAKE_OPTS="-DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21" ./scripts/compare-commits.sh 40a6430eb 3ad0161af test-backend-ops -o ADD
+ commit1=40a6430eb
+ commit2=3ad0161af
+ tool=test-backend-ops
+ additional_args='-o ADD'
+ '[' test-backend-ops '!=' llama-bench ']'
+ '[' test-backend-ops '!=' test-backend-ops ']'
+ ./scripts/compare-llama-bench.py --check
+ '[' test-backend-ops = llama-bench ']'
+ db_file=test-backend-ops.sqlite
+ target=test-backend-ops
+ run_args='perf --output sql -o ADD'
+ rm -f test-backend-ops.sqlite
+ '[' -n '' ']'
+ dir=build-bench
+ git checkout 40a6430eb
Note: switching to '40a6430eb'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 40a6430eb Update README.md
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S . -DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21
++ nproc
+ cmake --build build-bench -t test-backend-ops -j 12
+ build-bench/bin/test-backend-ops perf --output sql -o ADD
+ sqlite3 test-backend-ops.sqlite
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 MUSA devices:
  Device 0: MTT S80, compute capability 2.1, VMM: yes
+ git checkout 3ad0161af
Previous HEAD position was 40a6430eb Update README.md
HEAD is now at 3ad0161af musa: apply mublas API changes
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S . -DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21
++ nproc
+ cmake --build build-bench -t test-backend-ops -j 12
+ build-bench/bin/test-backend-ops perf --output sql -o ADD
+ sqlite3 test-backend-ops.sqlite
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 MUSA devices:
  Device 0: MTT S80, compute capability 2.1, VMM: yes
+ ./scripts/compare-llama-bench.py -b 40a6430eb -c 3ad0161af --tool test-backend-ops -i test-backend-ops.sqlite
| Backend   | Operation   | Parameters                              |   Bandwidth (GB/s) xd/compare-commits |   Bandwidth (GB/s) 3ad0161af |   Speedup |
|:----------|:------------|:----------------------------------------|--------------------------------------:|-----------------------------:|----------:|
| MUSA0     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,1,1,1]   |                                  4.53 |                         4.53 |      1.00 |
| MUSA0     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,512,1,1] |                                246.99 |                       247.09 |      1.00 |

llama-bench:

❯ ./scripts/compare-commits.sh 7ae027c03 7736d6426 llama-bench -m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999
+ commit1=7ae027c03
+ commit2=7736d6426
+ tool=llama-bench
+ additional_args='-m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999'
+ '[' llama-bench '!=' llama-bench ']'
+ ./scripts/compare-llama-bench.py --check
+ '[' llama-bench = llama-bench ']'
+ db_file=llama-bench.sqlite
+ target=llama-bench
+ run_args='-o sql -oe md -m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999'
+ rm -f llama-bench.sqlite
+ '[' -n '' ']'
+ dir=build-bench
+ git checkout 7ae027c03
Previous HEAD position was 7736d6426 Apply suggestion from @JohannesGaessler
HEAD is now at 7ae027c03 Update README.md
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S .
CMake Warning at ggml/src/ggml-cpu/CMakeLists.txt:77 (message):
  OpenMP not found
Call Stack (most recent call first):
  ggml/src/CMakeLists.txt:361 (ggml_add_cpu_backend_variant_impl)


++ nproc
+ cmake --build build-bench -t llama-bench -j 8
+ build-bench/bin/llama-bench -o sql -oe md -m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999
+ sqlite3 llama-bench.sqlite
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen2 7B Q4_K - Medium         |   4.36 GiB |     7.62 B | Metal,BLAS |       4 |           pp512 |        105.58 ± 1.85 |
| qwen2 7B Q4_K - Medium         |   4.36 GiB |     7.62 B | Metal,BLAS |       4 |           tg128 |          9.32 ± 1.50 |

build: 7ae027c03 (5844)
+ git checkout 7736d6426
Previous HEAD position was 7ae027c03 Update README.md
HEAD is now at 7736d6426 Apply suggestion from @JohannesGaessler
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S .
CMake Warning at ggml/src/ggml-cpu/CMakeLists.txt:77 (message):
  OpenMP not found
Call Stack (most recent call first):
  ggml/src/CMakeLists.txt:361 (ggml_add_cpu_backend_variant_impl)


++ nproc
+ cmake --build build-bench -t llama-bench -j 8
+ sqlite3 llama-bench.sqlite
+ build-bench/bin/llama-bench -o sql -oe md -m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen2 7B Q4_K - Medium         |   4.36 GiB |     7.62 B | Metal,BLAS |       4 |           pp512 |        101.41 ± 9.71 |
| qwen2 7B Q4_K - Medium         |   4.36 GiB |     7.62 B | Metal,BLAS |       4 |           tg128 |         11.11 ± 0.99 |

build: 7736d6426 (5843)
+ ./scripts/compare-llama-bench.py -b 7ae027c03 -c 7736d6426 --tool llama-bench -i llama-bench.sqlite
| CPU                  | Model           | Test   |   t/s xd/compare-commits |   t/s 7736d6426 |   Speedup |
|:---------------------|:----------------|:-------|-------------------------:|----------------:|----------:|
| Accelerate, Apple M1 | qwen2 7B Q4_K_M | pp512  |                   105.58 |          101.41 |      0.96 |
| Accelerate, Apple M1 | qwen2 7B Q4_K_M | tg128  |                     9.32 |           11.11 |      1.19 |
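The two traces above show the script choosing a database file and a set of fixed run arguments based on the selected tool, then appending the user's additional arguments. A rough Python sketch of that selection logic follows. The values are copied from the traces; the `TOOL_CONFIG` dictionary and the `plan_run` function are hypothetical and do not reflect how `compare-commits.sh` is actually structured.

```python
import shlex

# Per-tool settings as they appear in the `set -x` traces above.
# The dictionary layout is an assumption made for illustration.
TOOL_CONFIG = {
    "llama-bench": {
        "db_file": "llama-bench.sqlite",
        "base_args": ["-o", "sql", "-oe", "md"],
    },
    "test-backend-ops": {
        "db_file": "test-backend-ops.sqlite",
        "base_args": ["perf", "--output", "sql"],
    },
}


def plan_run(tool: str, additional_args: str, build_dir: str = "build-bench"):
    """Return (db_file, command) for one benchmark run of the given tool."""
    if tool not in TOOL_CONFIG:
        raise ValueError(f"unsupported tool: {tool}")
    cfg = TOOL_CONFIG[tool]
    # Fixed arguments come first, followed by whatever the user passed through.
    command = [f"{build_dir}/bin/{tool}", *cfg["base_args"], *shlex.split(additional_args)]
    return cfg["db_file"], command


db_file, command = plan_run("test-backend-ops", "-o ADD")
print(db_file)
print(shlex.join(command))
```

In the traces, the tool's SQL output is then fed to `sqlite3` with the chosen database file, once per commit, so that both runs end up in the same database before the comparison step.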

yeahdongcn requested a review from Copilot June 26, 2025 11:28
Copilot AI left a comment

Pull Request Overview

This PR adds support for comparing performance results from both llama-bench and test-backend-ops by introducing tool-specific database schemas, CLI argument parsing, and formatting functions. Key changes include:

  • Refactoring database field and key property definitions to support both tools.
  • Updating table queries and input file handling based on a new --tool argument.
  • Enhancing the CLI script (compare-commits.sh) to allow selection of the tool and additional arguments.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

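| File                           | Description                                                                          |
|:-------------------------------|:-------------------------------------------------------------------------------------|
| scripts/compare-llama-bench.py | Adjusts SQLite table creation, queries, and result formatting for dual tool support. |
| scripts/compare-commits.sh     | Updates argument parsing and build/run logic to handle multiple tools.               |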
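As an illustration of the `--tool` argument described in the overview, here is a minimal `argparse` sketch. Only the flags `-b`, `-c`, `-i`, and `--tool` appear in the commands shown in this PR. The `dest` names, the `llama-bench` default, and the fallback to `<tool>.sqlite` when `-i` is omitted are assumptions made for this example and may not match the real script.

```python
import argparse

KNOWN_TOOLS = ("llama-bench", "test-backend-ops")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Compare benchmark results between two commits (sketch)."
    )
    parser.add_argument("-b", dest="baseline", help="baseline commit")
    parser.add_argument("-c", dest="compare", help="commit compared against the baseline")
    # Restricting choices makes argparse reject unknown tools up front.
    # The default value here is an assumption.
    parser.add_argument("--tool", choices=KNOWN_TOOLS, default="llama-bench",
                        help="which tool produced the results")
    parser.add_argument("-i", dest="input", default=None, help="SQLite file to read")
    return parser


def resolve_input(args: argparse.Namespace) -> str:
    """Use the explicit -i path if given, else fall back to '<tool>.sqlite' (assumed)."""
    return args.input if args.input else f"{args.tool}.sqlite"


args = build_parser().parse_args(
    ["-b", "40a6430eb", "-c", "3ad0161af", "--tool", "test-backend-ops"]
)
print(args.tool, resolve_input(args))
```

Keying the schema, query, and formatting choices off a single validated `tool` value is one way to keep the two code paths from drifting apart.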

github-actions bot added the script and python labels Jun 26, 2025
yeahdongcn marked this pull request as ready for review June 26, 2025 11:33

yeahdongcn (Collaborator, Author) commented:

Hi @JohannesGaessler @slaren @ggerganov I’ve merged #14368 into master. Could you please continue reviewing this one when you have a moment? Thanks!

yeahdongcn force-pushed the xd/compare-commits branch from 5c1951b to b5ea15f July 5, 2025 04:15
yeahdongcn force-pushed the xd/compare-commits branch 2 times, most recently from 88b3c64 to 5e8f738 July 8, 2025 02:07
yeahdongcn force-pushed the xd/compare-commits branch from 5e8f738 to 7736d64 July 8, 2025 02:09
Signed-off-by: Xiaodong Ye <[email protected]>